Refactor manual tests job: extract reset logic and separate infinite block tests by nhorton · Pull Request #119 · Unsupervisedcom/deepwork

nhorton · 2026-01-22T20:54:44Z

Summary

This PR refactors the manual_tests job to improve maintainability and test organization by extracting common reset logic into a reusable step and moving infinite block tests into a dedicated serial step.

Key Changes

Version bump: Updated from 1.2.1 to 1.3.0
New reset step: Created steps/reset.md containing centralized reset instructions that other steps can call internally
- Consolidates git reset, file cleanup, and queue clearing logic
- Eliminates duplication across test steps
- Provides clear documentation of reset procedures
New infinite block tests step: Created steps/infinite_block_tests.md as a dedicated step for infinite block testing
- Moved 4 infinite block tests (2 prompt-based, 2 command-based) from run_fire_tests
- Tests both "should fire" (no promise) and "should NOT fire" (with promise) scenarios
- Runs serially with resets between tests due to blocking nature
Updated run_not_fire_tests.md:
- Reduced from 8 to 6 tests (removed infinite block tests)
- Added explicit sub-agent configuration requirements: model: "haiku" and max_turns: 5
- Updated quality criteria to reference reset step instead of inline commands
Updated run_fire_tests.md:
- Reduced from 8 to 6 tests (removed infinite block tests)
- Added explicit sub-agent configuration requirements: model: "haiku" and max_turns: 5
- Updated quality criteria to reference reset step instead of inline commands
- Clarified that infinite block tests are handled separately
Updated job.yml:
- Added sub-agent configuration guidance in description
- Updated step descriptions to reflect test count changes
- Added new infinite_block_tests step to workflow
- Updated changelog with version 1.3.0 entry
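
The sub-agent settings listed above can be pictured as a small configuration check. This is a minimal Python sketch, not code from the PR: `REQUIRED_SUB_AGENT_CONFIG` and `check_sub_agent_config` are hypothetical names, and only the values `model: "haiku"` and `max_turns: 5` come from the description.

```python
from typing import Any, Dict, List

# Values taken from the PR description; the constant and function names are hypothetical.
REQUIRED_SUB_AGENT_CONFIG: Dict[str, Any] = {"model": "haiku", "max_turns": 5}


def check_sub_agent_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config matches."""
    problems: List[str] = []
    for key, expected in REQUIRED_SUB_AGENT_CONFIG.items():
        if key not in config:
            problems.append(f"missing {key}")
        elif config[key] != expected:
            problems.append(f"{key} is {config[key]!r}, expected {expected!r}")
    return problems
```

A check like this could run before any sub-agent is launched, so that a test never starts without the `max_turns` cap in place.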

Implementation Details

All test steps now reference the reset step for cleanup procedures, reducing documentation duplication
Sub-agent configuration (model: "haiku", max_turns: 5) is now explicitly documented in all test steps
Infinite block tests are isolated in their own step to allow for proper serial execution and controlled observation of blocking behavior
Quality criteria have been updated across all steps to be more consistent and reference the reset step
The reset step includes detailed explanation of each command and when to use it
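
The serial execution pattern described above, with a reset around each blocking test, might look roughly like the following Python sketch. The function `run_serially` and its callables are hypothetical stand-ins; in the PR the actual steps are Markdown instructions (steps/reset.md and steps/infinite_block_tests.md), not Python functions. Resetting once more after the last test is an assumption made here so the environment is left clean.

```python
from typing import Callable, List, Tuple


def run_serially(
    tests: List[Tuple[str, Callable[[], bool]]],
    reset: Callable[[], None],
) -> List[Tuple[str, bool]]:
    """Run tests one at a time, resetting the environment around each one."""
    results: List[Tuple[str, bool]] = []
    for name, test in tests:
        reset()  # start every test from a clean environment
        results.append((name, test()))
    reset()  # assumption: leave the environment clean after the last test
    return results
```

Running the tests one at a time keeps the blocking behavior of each test observable in isolation, which is the reason the PR gives for moving them out of the parallel steps.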

…ck tests
- Add `model: "haiku"` and `max_turns: 5` config for all sub-agents to minimize cost/latency and prevent indefinite hangs
- Move infinite block tests (prompt and command) to dedicated serial step with both should-fire and should-not-fire scenarios
- Extract reset instructions to reusable reset.md step that other steps reference internally
- Reduce parallel tests from 8 to 6, serial tests from 8 to 6
- Bump version to 1.3.0

Auto-generated by `deepwork install`:
- Added skills for new infinite_block_tests and reset steps
- Updated existing step skills with new configuration

- Reset step now runs as a dependency before run_not_fire_tests to ensure clean environment before any tests begin
- "Should NOT fire" tests now verify the rules queue is empty after sub-agents complete, confirming rules truly didn't fire
- Update job description to reflect 4-step flow with reset first
- Bump version to 1.4.0
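
The empty-queue check for "should NOT fire" tests could be sketched as below. The PR does not say how the rules queue is stored, so this Python sketch assumes, purely for illustration, that it is a directory containing one file per queued entry. `pending_queue_entries`, `rules_did_not_fire`, and the `queue_dir` parameter are hypothetical.

```python
from pathlib import Path
from typing import List


def pending_queue_entries(queue_dir: Path) -> List[str]:
    """List file names in the (assumed) queue directory; a missing directory counts as empty."""
    if not queue_dir.is_dir():
        return []
    return sorted(p.name for p in queue_dir.iterdir() if p.is_file())


def rules_did_not_fire(queue_dir: Path) -> bool:
    """A 'should NOT fire' test passes only if nothing was queued."""
    return pending_queue_entries(queue_dir) == []
```

Checking the queue after the sub-agents complete, rather than relying on their reports, gives independent evidence that no rule fired.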

- Update Tests 3 & 4 to specify dual criteria: should fire AND should return in reasonable time (via max_turns limit)
- Add "Returned in Time?" column to results tracking table
- Note that Task tool has no direct timeout, so max_turns is the safeguard against infinite hanging
- Update quality criteria to separately verify "should NOT fire" and "should fire" test behaviors
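
The dual criteria and the results table can be illustrated with a short Python sketch. Only the "Returned in Time?" column name comes from the commit message. The other column names, the `InfiniteBlockResult` class, and `render_table` are hypothetical. The sketch assumes that the time criterion applies only to tests expected to fire, as the commit message describes for Tests 3 & 4.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InfiniteBlockResult:
    name: str
    expected_fire: bool
    fired: bool
    returned_in_time: bool  # True if the sub-agent came back within max_turns

    def passed(self) -> bool:
        if self.expected_fire:
            # Dual criteria: the rule fired AND the sub-agent returned in time.
            return self.fired and self.returned_in_time
        # "Should NOT fire" tests pass when the rule stayed silent.
        return not self.fired


def yes_no(value: bool) -> str:
    return "Yes" if value else "No"


def render_table(results: List[InfiniteBlockResult]) -> str:
    """Render results as a Markdown table with a 'Returned in Time?' column."""
    lines = [
        "| Test | Should Fire? | Fired? | Returned in Time? | Pass? |",
        "| --- | --- | --- | --- | --- |",
    ]
    for r in results:
        lines.append(
            f"| {r.name} | {yes_no(r.expected_fire)} | {yes_no(r.fired)} "
            f"| {yes_no(r.returned_in_time)} | {yes_no(r.passed())} |"
        )
    return "\n".join(lines)
```

Tracking the two outcomes in separate columns makes it visible when a rule fired correctly but the sub-agent only stopped because it hit the turn limit.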

- Simplify reset step to single criterion (environment clean)
- Update infinite_block_tests to include "returned in reasonable time" criterion for no-promise tests
- Keep verbose criteria for run_not_fire_tests and run_fire_tests

claude added 7 commits January 22, 2026 20:53

Sync generated skills for manual_tests job v1.3.0

8df8a3c


Consolidate changelog entries into single 1.4.0 entry

9f6d9dd

Use version 1.3.0 for consolidated changelog

0f21ab6

nhorton merged commit cc4ff7e into main Jan 22, 2026
4 checks passed

nhorton deleted the claude/update-manual-tests-job-cYkyd branch January 22, 2026 21:17

nhorton merged 7 commits into main from claude/update-manual-tests-job-cYkyd
